Model-Directed Document Image Analysis
Abstract
If current OCR engineering trends continue, then, we believe, "general-purpose" systems (that is, fully automatic and nonretargetable systems) will leave many potential users unsatisfied, and lucrative application niches unfilled, for years to come. However, for users who care enough to volunteer some manual effort to help customize the system to their document(s), significantly higher accuracy may be achievable, without delay. We discuss in detail two state-of-the-art document recognition systems, Lucent Technologies' Table Reader System (TRS) and Xerox's "document image decoding" (DID) research prototype, which yield high accuracy by reliance on explicitly stated models of properties of the target document, whether iconic (known typefaces and image degradations), geometric (restricted classes of layouts), or symbolic (linguistic and pragmatic contextual constraints). How great are the performance advantages that can be realized by sacrificing automation in these ways? To what extent can the necessary customizations be (semi-)automated? We outline recent and planned research at Xerox PARC motivated by these questions.

Footnote 0: Invited unrefereed talk presented at the DOD-…

The dominant type of present-day commercial OCR system, whether on the desktop or in service-bureau settings, is designed to operate fully automatically, refusing to accept guidance from the user. The majority of desktop users welcome this, since they are untrained and impatient with inconvenience. There is a similar reliance on more or less completely automatic operation in almost all of the highly specialized OCR application niches, such as postal-code and financial-document processing, even though their costly equipment is tended by trained staff in controlled service-bureau settings. In this case, it is largely the daunting throughput requirements that dictate fully automatic operation.
Both of these user communities, the casual SOHO users and the sophisticated special-document users, tolerate surprisingly low performance. The latest competitive studies, at UNLV in 1996 [1], showed, for example, that desktop OCR packages misrecognize 3-15% of characters (an intolerably high error rate, most users would agree) in over 40% of magazine pages; for other document categories, performance was far worse. The best current systems for reading handwritten courtesy amounts on checks [2] are tuned to reject 33-55% of the input in order to hold substitution errors below 1%. Similarly, the best handwritten postal-address readers fail to "finalize" 35% of the input [3]. All of these technologies are improvable, of course, and are improving, but slowly and at a high cost. The UNLV data suggested …
Related resources
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. The system includes preprocessing, document segmentation, feature extraction and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...
Learning Document Image Features With SqueezeNet Convolutional Neural Network
The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered the current state-of-the-art models for this task. However, these classifiers have two major drawbacks: the huge computational power demand for...
Persian Printed Document Analysis and Page Segmentation
This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis and segments the document image into a set of regions. In the high-resolution page segmentation, each region is segmented by connected-component analysis into homogeneous regions and identifyi...
Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback
Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach for document image retrieval based on keyword spotting is proposed. The proposed method presents a framework using relevance feedback. Relevance feedback, an interactive and efficient method, is used in this paper to imp...
Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)
Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions. Both deteriorate the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions, aiming to improve OCR results. Our methodology is based on finding text lines using a dynamic local connectivity map and then applying a l...
Comparing the Effectiveness of Music Therapy With Guided Imagery and Cognitive Strategies in Reducing Students' Anxiety
The present study compared the effectiveness of music therapy with guided imagery and cognitive strategies in reducing anxiety in high school students. It was hypothesized that music therapy with guided imagery is more effective than the cognitive strategies approach. In order to test the research hypothesis, from all hi...